Abstract 

A new method for the statistical analysis of 3-D point processes, based on the family of
Minkowski-Functionals, is explained and applied to modelled galaxy distributions generated by a toy model
and by cosmological simulations of the large-scale structure in the Universe. These measures are
sensitive to both geometrical and topological properties of spatial patterns and appear to be
very effective in discriminating between different point processes. Moreover, by means of conditional
subsampling, different building blocks of large-scale structures such as sheets, filaments and clusters
can be detected and extracted from a given distribution. 



1 Introduction 

The still unsolved question of the origin and evolution of the Large-Scale Structure (LSS) of the
Universe is a central topic in modern cosmology. The scientific approach to this problem rests on
three pillars: 



1. The distribution of matter in the Universe is assumed to be traced by luminous galaxies. Thus,
observation of galaxies and measurement of their redshifts is, besides peculiar-velocity
measurements, the only way to gain an idea of the matter distribution in our local area of the Universe. 

2. Today's powerful computer systems make the numerical simulation of statistical ensembles of
theoretical models of structure formation possible. In this way the effects of different dark matter
models on the resulting structures can be studied. 

3. The comparison of observations with theoretical models is made in terms of statistical measures
which can be applied to both real and simulated galaxy distributions. Thus, models can either
be ruled out or favoured and improved to fit observational results. 






The statistical method introduced here belongs to the third pillar and offers a new approach to finding
adequate measures that are capable of describing and characterizing global and local features of
galaxy distributions. It was first suggested by Mecke, Buchert and Wagner, but only now have the
numerical problems been overcome, so that first results can be obtained. 

Popular measures in that field include N-Point Correlation Functions [?], Counts-in-Cells, Void
Probability Functions [12], Percolation Analysis [13], Minimal Spanning Trees [?], Genus of Isodensity
Levels [?], Voronoi-foam Statistics [5], and many more. 

After discussing the basic properties of Minkowski-Functionals in Section 2, the new statistical method
is presented in Section 3. Its sensitivity to different components of the LSS (clusters, walls, filaments)
is investigated using toy models based on Voronoi tessellations, and the selection of structures by
means of conditional subsampling is illustrated (Section 4). Finally, the method is applied to a series
of CDM simulations in Section 5; a short summary is given in Section 6 and an outlook on future
prospects in Section 7. 



2 Minkowski-Functionals 

The Minkowski-Functionals $W_\nu$ have their origin in the mathematical theory of convex bodies and
integral geometry. In $d$ dimensions there exist $d + 1$ of these functionals, including geometrical and
topological descriptors that characterize content (volume and surface), shape and connectivity of a body
$A \subset \mathbb{R}^d$. 

In the case of three dimensions we have: 

$W_0 = V(A)$ (volume) 

$3 W_1 = S(A)$ (surface) 

$3 W_2 = H(A)$ (shape) 

$3 W_3 = G(A) = 4\pi \chi(A)$ (connectivity) 



Here, shape is expressed in terms of the integral mean curvature $H$ of a body's surface, and the
integral Gaussian curvature $G$, related to the Euler-characteristic $\chi$ via the Gauss-Bonnet theorem, is
a measure of the connectivity. 

Important properties of the Minkowski-Functionals include: 

Motion Invariance : $W_\nu(uA) = W_\nu(A)$ 

Additivity : $W_\nu(A \cup B) = W_\nu(A) + W_\nu(B) - W_\nu(A \cap B)$ 

Thus, the measures are invariant under translations and rotations of the body, or combinations $u$ of
the two. The $W_\nu$ of a body $C$ which is the union of two point sets $A$ and $B$ obey a simple additivity
relation that can be extended by induction to an arbitrary number of components of $C$. Except for
the volume, all measures are located on the surface of the considered body. 

A more detailed discussion of the Minkowski-Functionals can be found in [?] and [?]. The mathematical
background of integral geometry is covered by [?]. 
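
The additivity relation can be made concrete for the simplest case of two equal balls. The following Python sketch is only an illustration and not the code used for the results reported here; the function names are hypothetical. It uses the standard closed-form volume and cap area of the lens-shaped intersection of two balls of radius r whose centres are a distance d apart, and applies additivity to the volume, the surface area and the Euler-characteristic.

```python
import math

def ball_volume(r):
    return 4.0 * math.pi * r ** 3 / 3.0

def ball_surface(r):
    return 4.0 * math.pi * r ** 2

def lens_volume(r, d):
    """Volume of the intersection of two balls of radius r at centre distance d."""
    if d >= 2.0 * r:
        return 0.0
    return math.pi * (4.0 * r + d) * (2.0 * r - d) ** 2 / 12.0

def lens_surface(r, d):
    """Surface of the intersection: two spherical caps of height r - d/2."""
    if d >= 2.0 * r:
        return 0.0
    return 2.0 * (2.0 * math.pi * r * (r - d / 2.0))

def union_measures(r, d):
    """Volume, surface and Euler-characteristic of A u B from W(A) + W(B) - W(A n B)."""
    overlap = d < 2.0 * r
    volume = 2.0 * ball_volume(r) - lens_volume(r, d)
    surface = 2.0 * ball_surface(r) - lens_surface(r, d)
    # Each ball and each non-empty (convex) intersection has Euler-characteristic 1.
    chi = 1 + 1 - (1 if overlap else 0)
    return volume, surface, chi
```

For disjoint balls the intersection terms vanish and the measures simply add; as soon as the balls overlap, the Euler-characteristic drops from 2 to 1.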

Euler-characteristic 

The Euler-characteristic $\chi = G/4\pi$ of a body $A$ is related to its genus $g$ by 

$\chi(A) = 1 - g(A)$ , 

so in three dimensions $\chi$ can be expressed as 

$\chi = \text{components} - \text{tunnels} + \text{cavities}$ . 

In this way it can be used as a topological measure for the connectivity of a point set. 
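
As a small worked illustration of this counting rule, the following Python sketch evaluates it for the three everyday bodies of Figure 1. The function name and the dictionary are hypothetical and serve only as an example.

```python
def euler_characteristic(components, tunnels, cavities):
    """Euler-characteristic of a body in three dimensions from its topological counts."""
    return components - tunnels + cavities

# (components, tunnels, cavities) for the idealized bodies of Figure 1
examples = {
    "pair of skis": (2, 0, 0),
    "tennis racket with two tunnels": (1, 2, 0),
    "tennis ball": (1, 0, 1),
}

for name, counts in examples.items():
    print(name, euler_characteristic(*counts))
```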




Figure 1: Example: A pair of skis consists of two components and, unless there are holes or cavities inside a ski,
an Euler-characteristic of $\chi = 2$ is found. The tennis racket has two "tunnels" and one component, yielding $\chi = -1$
(however, a more realistic one would have a highly negative $\chi$). Finally, a tennis ball has a cavity and thus $\chi = 2$. 

3 Method 

3.1 Boolean Grain Model 

In a given distribution the points (galaxies) are represented by their Cartesian coordinates $x_i$. The
input data are the coordinates of all $N$ points inside a cubic volume $C$ that is cut out of the
point distribution. This cube is then scaled to an edge length of one (unit cube $C_u$). Outside this box
periodic boundary conditions are used, i.e. $C_u$ is supplemented by 26 identical boxes around it. For
the calculations, a ball $B_r$ with radius $r$ is centered on every point (Boolean grain model). 

Figure 2: Each point is decorated with a ball $B_r$. 
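
The preparation of the input data described above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the production code: the function names, the `origin` and `edge` parameters of the cut-out cube, and the wrapping of coordinates into the half-open interval [0, 1) are all hypothetical choices.

```python
import itertools

def scale_to_unit_cube(points, origin, edge):
    """Map points of a cube with the given corner and edge length into [0, 1)^3."""
    return [tuple(((c - o) / edge) % 1.0 for c, o in zip(p, origin)) for p in points]

def periodic_images(points):
    """Return the points of the unit cube together with their 26 shifted copies."""
    shifts = itertools.product((-1, 0, 1), repeat=3)
    return [(x + sx, y + sy, z + sz) for (sx, sy, sz) in shifts for (x, y, z) in points]

# Example: two points in a cube of edge length 10 with corner (10, 20, 30)
unit_points = scale_to_unit_cube(
    [(10.0, 20.0, 30.0), (15.0, 25.0, 35.0)], origin=(10.0, 20.0, 30.0), edge=10.0
)
all_points = periodic_images(unit_points)
```

The list returned by `periodic_images` contains 27 copies of the sample (the original cube plus its 26 neighbours), so that balls close to a face of the cube see the neighbours required by the periodic boundary conditions.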

The Minkowski measures of the point set $B(r) = \bigcup_{i=1}^{N} B_r(x_i)$, which consists of all the points inside the
union of the balls, are then calculated, partly using numerical methods. In practice, the local contributions
$W_\nu(r, x_i)$ to the functionals from the surface of each single ball $B_r(x_i)$ ($i = 1, \ldots, N$) are determined, and
global measures arise by taking the mean of all the $W_\nu(r, x_i)$ for a given radius $r$. It can be shown
that these global results contain n-point correlations of every order n. 

The radius $r$ of the balls is the parameter that is needed to analyze the distribution on different scales.
For small radii most of the balls are isolated, thus yielding the measures of a single ball ($\chi = 1$). With
growing $r$ more and more balls intersect and connect to network-like structures ($\chi < 0$). Finally, the
network turns into a body with enclosed cavities ($\chi > 0$), which are then filled when the cube $C_u$ is
completely covered by $B(r)$. 
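
The first stage of this sequence, in which most balls are still isolated, is easy to check numerically. The following Python sketch is an illustration only, with hypothetical function names; it assumes points in the unit cube with periodic boundaries and radii small enough (2r < 0.5) that only the nearest periodic image of each neighbour matters. It returns the fraction of balls that do not touch any other ball.

```python
import math

def periodic_distance(p, q):
    """Distance between two points of the unit cube, using the nearest periodic image."""
    total = 0.0
    for a, b in zip(p, q):
        delta = abs(a - b)
        delta = min(delta, 1.0 - delta)
        total += delta * delta
    return math.sqrt(total)

def isolated_fraction(points, r):
    """Fraction of balls B_r(x_i) whose centre is farther than 2r from all other centres."""
    isolated = 0
    for i, p in enumerate(points):
        others = (q for j, q in enumerate(points) if j != i)
        if all(periodic_distance(p, q) > 2.0 * r for q in others):
            isolated += 1
    return isolated / len(points)
```

Evaluating `isolated_fraction` for increasing r shows the transition from isolated balls to connected structures; the Euler-characteristic itself requires the full surface calculation described in the text.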





Figure 3: Boolean grain model for a Poisson process for two values of $r$, corresponding to ($x = r/\langle D \rangle$) $x_1 = 0.18$
(left) and $x_2 \approx 0.36$ (right); $\langle D \rangle$ is the mean distance of points in the set. 
3.2 Results for Poissonian Sample 

The mean values of the Minkowski-Functionals can be calculated analytically for a stationary Poisson
process [?]. The radius is scaled by the mean distance $\langle D \rangle$ between the points in the cube $C_u$. 
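
For the volume functional the analytic mean is the well-known covered volume fraction of a Boolean model, $1 - \exp(-\rho \, 4\pi r^3/3)$, where $\rho$ is the number density of points. The Python sketch below is an illustration under our own assumptions and not the code used for the figures: the function names are hypothetical, the points live in the periodic unit cube, and 2r < 0.5. It compares this expression with a simple Monte Carlo estimate for one random sample.

```python
import math
import random

def covered_fraction_poisson(density, r):
    """Expected volume fraction covered by balls of radius r around a Poisson process."""
    return 1.0 - math.exp(-density * 4.0 * math.pi * r ** 3 / 3.0)

def periodic_distance_sq(p, q):
    """Squared distance in the unit cube, using the nearest periodic image."""
    total = 0.0
    for a, b in zip(p, q):
        delta = abs(a - b)
        delta = min(delta, 1.0 - delta)
        total += delta * delta
    return total

def covered_fraction_monte_carlo(points, r, n_test=2000, seed=0):
    """Fraction of random test positions that lie inside at least one ball."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_test):
        t = (rng.random(), rng.random(), rng.random())
        if any(periodic_distance_sq(t, p) <= r * r for p in points):
            hits += 1
    return hits / n_test

# One random sample of 200 points in the unit cube (density 200 per unit volume)
sample_rng = random.Random(1)
sample = [(sample_rng.random(), sample_rng.random(), sample_rng.random()) for _ in range(200)]
expected = covered_fraction_poisson(200, 0.08)
estimate = covered_fraction_monte_carlo(sample, 0.08)
```

The two numbers agree only within the statistical scatter of a single sample; averaging over several samples reduces the difference.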





Figure 5: The distribution of the local Euler-characteristic. 



4 Analysis of Toy Models 
4.1 Voronoi Tessellations 



A toy model for the artificial generation of structures roughly similar to those observed in the
Universe is based on Voronoi tessellations: 




Figure 6: (from [?]) A Voronoi tessellation is a decomposition of space into convex, non-overlapping domains,
which are generated as Wigner-Seitz cells of a discrete distribution of points;
a) Voronoi tessellation in two dimensions, b) example of a cell and its nucleus (d = 3). 



A Poissonian distribution of points ("nuclei") inside $C_u$ defines a pattern of Wigner-Seitz cells, the
tessellation. In a second step, $N$ test points are randomly distributed inside $C_u$. These
points are then projected onto the walls, edges, or vertices of the tessellation, thus generating sheets (or
walls), filaments or clusters. For all these structure components one can choose a finite thickness and
the fraction of points ending up in each of them. 
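
One possible way to realize the projection onto walls is sketched below in Python. The exact projection rule is not specified in the text, so this is an assumption: each test point is moved perpendicularly onto the plane that bisects its two nearest nuclei, which contains the wall between the two corresponding cells. The function names are hypothetical, and periodic boundaries and the finite wall thickness are ignored for brevity.

```python
def dist_sq(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def project_onto_wall(point, nuclei):
    """Project a point onto the bisecting plane of its two nearest nuclei."""
    n1, n2 = sorted(nuclei, key=lambda n: dist_sq(point, n))[:2]
    normal = tuple(b - a for a, b in zip(n1, n2))
    midpoint = tuple((a + b) / 2.0 for a, b in zip(n1, n2))
    norm_sq = sum(c * c for c in normal)
    # Offset of the point from the plane, measured in units of the normal vector
    t = sum((pc - mc) * nc for pc, mc, nc in zip(point, midpoint, normal)) / norm_sq
    return tuple(pc - t * nc for pc, nc in zip(point, normal))

nuclei = [(0.2, 0.5, 0.5), (0.8, 0.5, 0.5), (0.5, 0.9, 0.9)]
wall_point = project_onto_wall((0.3, 0.4, 0.6), nuclei)
```

The projected point is equidistant from the two nearest nuclei. In rare cases it may end up closer to a third nucleus, i.e. outside the actual cell face; a more careful implementation would test for this and project again.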



4.2 Walls, Filaments and Clusters 

The method is sensitive to different structures in a distribution of points. This can be illustrated
by looking at the local contributions to the Euler-characteristic $\chi$ on the surface $S$ of each ball. As
$\chi$ is equivalent to the integrated Gaussian curvature of the considered body, it can be split into three
contributions [?], coming from the uncovered surface ($\chi_{Sur}$), the intersection lines ($\chi_{Lin}$) of two balls
on $S$ and the triple points ($\chi_{Tri}$) where three balls have a common point on $S$. Surface points that
belong to more than three balls have a low probability and can be neglected. 

While for walls $\chi$ includes contributions from lines and triple points over a wide range of $x$, hardly
any triple points appear for filaments. The Euler-characteristic $\chi$ has a narrow minimum for clustered
distributions, but no distinct minimum for filaments. 
Figure 7 (panels: Poisson, Walls; horizontal axis $x$ from 0 to 2): Local contributions to $\chi$ from the surface ($\chi_{Sur}$),
the intersection lines ($\chi_{Lin}$) and the triple points ($\chi_{Tri}$); ($x = r/\langle D \rangle$). 

4.3 Conditional Subsampling 

Structures can be detected and analyzed by selecting only galaxies with a local contribution to a
functional inside a given interval. 

As an example we consider a mixture of filaments (4000 points) generated in the Voronoi model and
a Poissonian distribution (also 4000 points). We can extract the filament component by suitable
conditions on the local contributions to the Minkowski-Functionals: typical for filaments, e.g., is
$\chi \approx 0$ (see Figure [?] and Figure 8); we have chosen the condition $-0.1 < \chi < 0.1$. 
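
The selection step itself is a simple filter on the local contributions. The following Python sketch shows the idea; the function names, the example values and the labels are hypothetical and only serve to illustrate the condition $-0.1 < \chi < 0.1$ and the fraction of selected points that truly belong to filaments.

```python
def conditional_subsample(local_values, lower, upper):
    """Indices of points whose local contribution lies strictly inside (lower, upper)."""
    return [i for i, v in enumerate(local_values) if lower < v < upper]

def purity(selected, labels, target):
    """Fraction of the selected points that carry the target label."""
    if not selected:
        return 0.0
    return sum(1 for i in selected if labels[i] == target) / len(selected)

# Made-up local Euler-characteristic contributions and true components
chi_local = [0.02, -0.5, 0.09, 0.4, -0.05, 0.1]
labels = ["filament", "poisson", "filament", "poisson", "poisson", "filament"]

selected = conditional_subsample(chi_local, -0.1, 0.1)
filament_fraction = purity(selected, labels, "filament")
```

In a toy model the true component of each point is known, so the purity of a selection can be evaluated directly, as quoted in the caption of Figure 8.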
Figure 8 (panels: filaments + Poisson, filaments in mixture, selection 1, selection 2): Original distribution, filament
component and two selected samples at different radii $r$, corresponding to ($x = r/\langle D \rangle$) $x_1 = 0.28$ and $x_2 = 0.48$.
The first selection consists of about 2000 points with more than 88% of them
belonging to filaments. The second has 3400 points with 70% in filaments. The pictures show all points of the samples
projected onto the x-y plane of the cube $C_u$. 
4.4 Two point distributions with similar two-point correlations 

Figure 9: The quotient of $\chi$ and the uncovered fraction of the surface $S$ is a good measure to discriminate between the two
distributions 1 and 2, which have similar 2-point correlation functions $\xi(x)$ on scales smaller than $x_0$, where $\xi(x_0) = 1$. The error
bars shown are the standard deviations of the mean of the ball contributions in each sample. 
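
The error bars mentioned in the caption can be obtained from the per-ball contributions as the standard deviation of the mean. A minimal Python sketch, with a hypothetical function name and assuming that the contributions of the balls can be treated as independent measurements:

```python
import math
import statistics

def mean_and_standard_error(values):
    """Mean of the per-ball contributions and the standard deviation of that mean."""
    n = len(values)
    mean = statistics.mean(values)
    # Sample standard deviation divided by sqrt(n) gives the error of the mean.
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return mean, standard_error

mean, error = mean_and_standard_error([1.0, 2.0, 3.0, 4.0])
```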



5 Analysis of CDM-Simulations 



Figure 10: $k^n$ power spectra with $n = 0, -1, -1.5$ [S. D. M. White, priv. comm.];
10000 points (out of $10^{?}$) plotted for each sample. 
The results clearly show the different response of the Minkowski-Functionals to the three spatial
patterns. For a constant power spectrum ($\propto k^0$) many small clusters can form, while for negative $n$
bigger structures appear. In the case of $n = 0$ the fraction of galaxies in clusters is higher than for $n < 0$,
so the volume and surface measures yield lower values, while the mean curvature remains positive. For
negative $n$ the points outside of clusters are part of a network generated by the corresponding balls
in the grain model, thus adding high surface contributions and negative curvature values (curvature
of intersection lines $H_{Lin} < 0$). 
6 Conclusions 

The method introduced here is based on a solid mathematical background, and no special
assumptions about the sample have to be made. The mean values of the functionals over the balls in a
sample are statistically unbiased descriptors and include topological and geometrical measures,
containing n-point correlations of every order [?]. They are statistically robust even for small samples and
depend on the radius of the balls, which is the (single) diagnostic parameter in the Boolean grain model.
The method is efficient in discriminating between spatial patterns and in selecting structures by means of
conditional subsampling. 



7 Outlook 

The calculation of the family of Minkowski-Functionals by using a Boolean model of penetrable
grains and evaluating the local (per grain) contributions to the functionals $W_\nu$ on the surface
of the overlapping union of grains is an effective way of characterizing the local as well as the global
morphology of point processes, as we have just shown. 

As a result of the local calculation of the contributions to the MFs, this method offers a variety of
information through conditional subsampling and through various combinations of MFs. The
latter is necessary, as noted earlier [?]: the Euler characteristic alone is not a preferable tool to
discriminate between similar point processes (compare Fig. 2, left and right panels, in [?] and Fig. 11). In addition,
geometrical information is needed, which is furnished by the other MFs. Only all 4 measures together can
(within the class of additive and motion-invariant measures) completely characterize the morphology. 

The current numerical code to calculate the MFs performs with reasonable storage and CPU time (we
need about 180-250 MB of storage and 1-6 h of CPU time, depending on machine and degree of clustering,
to analyse, e.g., a sample of $10^{?}$ points). 

In our statistics group (SFB 375/B3) we are currently working on an alternative way to calculate the MFs:
we use a grid and calculate the MFs from their contributions per grid cell on a sufficiently fine
grid. The resulting code is comparatively fast and storage efficient. Moreover, it has another
advantage: in addition to point samples we can consider density or temperature fields (given on a
grid), which renders the method applicable to smoothed cosmic density fields and microwave
background maps. Smoothing is another key element of this method and defines one of our near-future
perspectives. 
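
To indicate how a grid-based calculation can work in principle, the following Python sketch computes three of the measures for a body made of unit grid cells. It is not the code under development but a minimal illustration under stated assumptions: the function name is hypothetical, the grid spacing is one, the cells are treated as closed cubes (so cells touching at an edge or a corner count as connected), and the Euler-characteristic is obtained from the standard alternating count of vertices, edges, faces and cubes.

```python
import itertools
from collections import Counter

def grid_minkowski(voxels):
    """Volume, surface area and Euler-characteristic of a union of closed unit cells."""
    cubes = set(voxels)
    vertices, edges, faces = set(), set(), Counter()
    for x, y, z in cubes:
        # 8 corners of the cell
        for dx, dy, dz in itertools.product((0, 1), repeat=3):
            vertices.add((x + dx, y + dy, z + dz))
        for axis in range(3):
            # 4 edges parallel to this axis, labelled by their lower end point
            for a, b in itertools.product((0, 1), repeat=2):
                off = [a, b]
                off.insert(axis, 0)
                edges.add((x + off[0], y + off[1], z + off[2], axis))
            # 2 faces normal to this axis, labelled by their lower corner
            for s in (0, 1):
                off = [0, 0, 0]
                off[axis] = s
                faces[(x + off[0], y + off[1], z + off[2], axis)] += 1
    volume = len(cubes)
    # A face belongs to the surface if exactly one occupied cell contains it.
    surface = sum(1 for count in faces.values() if count == 1)
    chi = len(vertices) - len(edges) + len(faces) - len(cubes)
    return volume, surface, chi

ring = [(x, y, 0) for x in range(3) for y in range(3) if (x, y) != (1, 1)]
shell = [c for c in itertools.product(range(3), repeat=3) if c != (1, 1, 1)]
```

A single cell gives an Euler-characteristic of 1, the ring (one tunnel) gives 0 and the hollow shell (one cavity) gives 2, in agreement with the rule components - tunnels + cavities. For a grid spacing h the volume and surface counts have to be multiplied by h^3 and h^2, respectively.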

We envisage two steps: 

1. We calculate the MFs of smoothed fields (employing, e.g., Gaussian filters). Introducing a second
diagnostic parameter (e.g. a density threshold), this method covers the topology approach by Gott,
Melott, Weinberg and collaborators (see [17] and references therein), where we use 4 functionals instead of just
one. Here we try to restore motion invariance (with respect to the discrete group of motions) and
additivity, two important properties that allow for a local characterization of morphology. This method
will also cover the topological/geometrical approach on a grid introduced in [18]. 

2. We will combine so-called Koenderink measures ([15], [16], [?]) with MFs. Koenderink measures
provide a set of specific filters which extract different pattern-recognition elements from a smoothed
point set. For different smoothing lengths the performance of these filters displays an optimum on
some smoothing scale. 
The combination of these methods will provide a powerful tool for the morphological characterization
of point processes and of density and temperature fields in scale-space. 

A further perspective concerns stereological applications of MFs (see, e.g., [?]), i.e., the possibility of
extracting 3D information from lower-dimensional data sets (such as pencil beam surveys). It will be
necessary to analyse smoothed and unsmoothed Minkowski measures in three-dimensional space
and compare them with those in projected distributions. 